ABSTRACT

The  use  of  residuals  to  test  the  assumption  of  normality  of  the  errors  in 
a  linear  model  is  considered.  Standard  tests  for  normality  typically  require 
an  assumption  of  independence;  however  the  residuals  are  correlated.  An 
investigation  of  the  Shapiro-Wilk  test  shows  that  it  is  affected  by  these 
correlations,  but  the  problem  can  be  overcome  by  a  simple  adjustment  to  the 
test  procedure. 

SIGNIFICANCE AND EXPLANATION


In  a  regression  situation,  the  residuals  provide  (correlated  and  often 
unequal  variance)  estimates  of  the  errors  in  the  postulated  model.  We  anticipate 
that,  if  the  errors  are  normally  distributed  as  is  usually  assumed,  the  residuals 
will  show  a  "similar"  behaviour.  Thus  the  departure  from,  or  consonance  with, 
normality  of  the  residuals  is  of  interest.  Standard  tests  for  normality  typically 
require  an  assumption  of  independence;  even  if  the  residuals  are  standardized  to 
be of equal variance, they will be correlated. We discuss various test
possibilities and then focus on the well-known Shapiro-Wilk test. Although it is affected
by  the  correlations,  this  difficulty  can  be  overcome  by  a  simple  adjustment  to 
the  test  procedure. 


The responsibility for the wording and views expressed in this descriptive summary
lies with MRC, and not with the authors of this report.


TESTING  THE  NORMALITY  OF  RESIDUALS 


N.  R.  Draper  and  J.  A.  John 

I.  INTRODUCTION 

Consider the linear model

$$ y = X\beta + \varepsilon \tag{1.1} $$

where $y$ is an $n \times 1$ vector of observations, $X$ is an
$n \times p$ matrix of specified predictor variables, $\beta$ is a
$p \times 1$ vector of parameters and $\varepsilon$ is an $n \times 1$
vector of unknown errors assumed to be $N(0, \sigma^2 I)$. The vector
of residuals obtained from a least squares (LS) fit of (1.1) is given by

$$ e = (I - R)y \tag{1.2} $$

where $R = X(X'X)^{-1}X'$. Examination of these residuals to check
the basic model assumptions is essential. With the advent of
large scale computing, a huge literature has grown up on ways to
examine and test the residuals. For basic references see, for
example, Seber (1977), Barnett and Lewis (1978), Belsley, Kuh and
Welsch (1980), Draper and Smith (1981) and Hawkins (1980).

In  this  paper,  the  use  of  residuals  to  test  the  assumption 
of  the  normality  of  errors  is  examined.  There  are  several  tests  for 
normality  in  use,  but  typically  they  require  the  assumption  of 
independence.  These  tests  are  considered  in  the  next  section. 

However, they may not be appropriate when applied to LS residuals
since these residuals are not independent. They are based on (n - p)
rather than n degrees of freedom and are correlated, their
variance-covariance matrix being $V(e) = (I - R)\sigma^2$. This lack of
independence raises a number of interesting questions. To
what extent are the standard normality tests affected by the
presence of correlations amongst the residuals? Can these
tests be amended or are new tests necessary? Alternatively,
can a subset of the (n - p) residuals or, more generally,
(n - p) linearly and statistically independent linear functions
of the residuals be found to which the standard tests can be
applied?
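
The correlation structure described above can be inspected numerically.
The following is a minimal sketch, assuming NumPy is available and using
a small made-up design matrix (the sizes n = 8, p = 3 and the random
predictors are illustrative assumptions, not taken from this report). It
forms R and I - R and confirms that the residual covariance matrix has
rank n - p and non-zero off-diagonal entries.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 3
# Hypothetical design matrix: an intercept plus two random predictors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

# Projection matrix R = X (X'X)^{-1} X' and the matrix I - R.
R = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - R

sigma2 = 1.0
cov_e = sigma2 * M  # V(e) = (I - R) sigma^2

rank = np.linalg.matrix_rank(M)
off_diag = cov_e - np.diag(np.diag(cov_e))

print("rank of I - R:", rank, "(n - p =", n - p, ")")
print("residual variances:", np.round(np.diag(cov_e), 3))
print("largest |covariance| between two residuals:",
      round(float(np.abs(off_diag).max()), 3))
```

The unequal diagonal entries show why residuals are usually standardized
before testing, and the non-zero off-diagonal entries are the
correlations that the standard tests ignore.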

A number of ways of transforming residuals to independence
have been proposed and some of the methods are described in
Section 3. For a number of reasons we feel that such transformed
residuals are not, in general, entirely suitable for testing
normality. Instead, an investigation into the use of one of the
standard tests, namely the Shapiro-Wilk test, on the LS residuals
(1.2) is reported in the remaining sections. Our conclusion is
that this test, properly interpreted, gives a suitable basis for
testing the normality of residuals.

2.  TESTS  FOR  NORMALITY 

A number of tests have been proposed to check for normality.
A popular one that has stood up well in various comparative
investigations is due to Shapiro and Wilk (1965, 1968); see formula (4.1).
Extensive comparisons of the Shapiro-Wilk statistic with competitors
were made by Shapiro, Wilk and Chen (1968). They concluded that it
was a "generally superior omnibus measure of non-normality".
Similar conclusions transfer to a modification of the Shapiro-Wilk
test statistic for sample sizes exceeding 50, described by
Shapiro and Francis (1972). Other investigations favouring the
Shapiro-Wilk statistic are given in Dyer (1974) and Huang and
Bolch (1974).

For  samples  of  size  50  or  more,  D'Agostino  (1971) 
suggests  a  statistic  D  which  is  "up  to  a  constant  the  ratio  of 
Downton's  linear  unbiased  estimator  of  the  population  standard 
deviation  to  the  sample  standard  deviation".  The  null  distribution 
of  D  can  be  approximated  by  Cornish-Fisher  expansions. 

D'Agostino  and  Rosman  (1974)  in  an  investigation  of  Geary's 
computationally  simple  test  based  on  the  ratio  of  the  mean  deviation 
to  the  standard  deviation  conclude  that  it  is  a  possibly  useful  test 
but  that  "there  appears  to  be  no  specific  situation  where  Geary's 
test  clearly  and  for  practical  purposes  dominates  all  other  tests...". 

Spiegelhalter (1977, 1980) investigates an omnibus test
for normality "based on the posterior probability of the normal shape"
under various assumptions about the sampled population. Power
comparisons with other tests for n = 20 and 50 show "good overall
performance when n = 20" with a drop in power "against moderately
asymmetric alternatives for n = 50".

Lin  and  Mudholkar  (1980)  offer  a  test  statistic  Z  which 
makes  use  of  the  fact  that  it  is  only  for  the  normal  distribution 
that  the  mean  and  variance  are  distributed  independently.  This  test, 
against  symmetric  alternatives,  is  summarised  by  Nelson  (1981)  who 
also  provides  a  table  of  critical  values. 

Martinez and Iglewicz (1981) suggest a test statistic
which is the ratio of two estimators of variance and make a power
comparison in which the "proposed test outperforms all considered
competitors for long-tailed symmetric alternatives and performs
well for all other cases considered". However, it is not
universally superior to all other tests.


3.  UNCORRELATED  TRANSFORMED  RESIDUALS

The residuals $e = (I - R)y$ are linear combinations of
the observations and are based on n - p degrees of freedom.
Since I - R is of rank n - p it is possible to find an (n - p) x n
matrix A of rank n - p such that

$$ e_a = Ae \tag{3.1} $$

is $N_{n-p}(0, \sigma^2 I)$. The vector $e_a$ thus consists of (n - p)
uncorrelated homoscedastic residuals. The matrix A necessarily
satisfies $AX = 0$ and $AA' = I$.

There are, however, many possible choices for A. The
overall aim is to make a "good" choice which will facilitate some
stated objective. The following choices have been suggested:

(a)  Theil (1965) proposed the use of Best Linear Unbiased Scaled
(BLUS) residuals. Let the vector Je contain (n - p)
selected components of e; J is a submatrix of $I_n$. Then
BLUS residuals are obtained by choosing A so that the
expected sum of squares of the discrepancies in the vector
$e_a - Je$ is minimised, that is, so that

$$ E\{(e_a - Je)'(e_a - Je)\} $$

is minimised. For a succinct, excellent account, see
Grossman and Styan (1972); alternatively see Dent and Styan
(1978). For a program which computes BLUS residuals, see
Farebrother (1976). For related work, see Koerts (1967) and
Abrahamse and Koerts (1971).

(b)  Tiao and Guttman (1967) suggested augmenting $e_a$ in (3.1) by
p independent random variables $z_1, z_2, \ldots, z_p$ which are
uncorrelated with e and such that $E(z) = 0$, $V(z) = \sigma^2 I$.
If (K, J) and G are n x n permutation and orthogonal
matrices respectively, then a general form of such residuals
can be written (see Dent and Styan, 1978) as

$$ e_g = G(Je_a + Kz) $$

Following Hildreth (1971), the choice of G to minimise the
trace of the variance covariance matrix of $e_g - e$ leads to
the so-called Best Augmented Unbiased Scaled (BAUS) residuals.
The $e_a$ residual vector is said to be "unaugmented", the $e_g$
is "augmented". For additional details see Dent and Styan (1978).

For properties of uncorrelated transformed residuals, see
also Godolphin and Tullio (1978).
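
To make the non-uniqueness of A concrete, the sketch below builds one
valid choice from a full QR decomposition of X and then a second one by
rotating the first. This is only an illustration under stated
assumptions: it is neither the BLUS nor the BAUS construction, the
function name `uncorrelating_matrix` is hypothetical, and the design
matrix is made up.

```python
import numpy as np

def uncorrelating_matrix(X):
    """Return an (n - p) x n matrix A with A X = 0 and A A' = I.

    Assumes X has full column rank. The last n - p columns of the full
    Q factor span the orthogonal complement of the columns of X.
    """
    n, p = X.shape
    Q, _ = np.linalg.qr(X, mode="complete")
    return Q[:, p:].T

rng = np.random.default_rng(0)
n, p = 8, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = rng.standard_normal(n)

A = uncorrelating_matrix(X)
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)  # I - R
e = M @ y                                          # LS residuals
e_a = A @ e                                        # transformed residuals

# Any orthogonal rotation of A is an equally valid choice.
G, _ = np.linalg.qr(rng.normal(size=(n - p, n - p)))
A2 = G @ A

print(np.allclose(A @ X, 0), np.allclose(A @ A.T, np.eye(n - p)))
print(np.allclose(A2 @ X, 0), np.allclose(A2 @ A2.T, np.eye(n - p)))
```

Both A and A2 give n - p uncorrelated, equal-variance residuals, yet the
two transformed vectors differ, which is the arbitrariness discussed
later in this section.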

Other  alternative  approaches  are  as  follows: 

(c)  Hedayat and Robson (1970) and Brown, Durbin and Evans (1975) have
suggested the use of "stepwise" (or recursive) residuals; see also
Farebrother (1978). The uth (u > p) such residual is calculated
as the deviation of the uth observation from its predicted
value based on a LS fit to only the first u observations,
normalized to have variance $\sigma^2$. The n - p stepwise
residuals are not only mutually independent and homoscedastic
but are also independent of all the calculated regression functions.
They are not, of course, a transformed set of overall LS
residuals and their vector is not restricted to the same space
as the vector e but they are clearly linear combinations of
the observations. Moreover, each stepwise residual has a clear
identification with one point of the design. However, the set
of stepwise residuals obtained is entirely dependent on the order
selected for the entry of the observations. Hedayat and Robson
argue that the ordered fitted values might be used to specify
the ordering. Brown et al. suggest the use of such residuals
in time series where there is a natural ordering.

(d)  Gentleman and Wilk (1975) and John and Draper (1978) have
suggested the use of "adjusted" residuals as an aid to the
detection of outliers. The procedure is as follows. Obtain
the residuals from a LS analysis of all data points and choose
one of them, normalized to have variance $\sigma^2$, as the first
adjusted residual. Then the point corresponding to this residual
is deleted and the procedure repeated on the reduced data set to
give a second adjusted residual, again normalized to have
variance $\sigma^2$. A second point is then deleted from the data
set, and so on. This procedure is repeated until n - p
adjusted residuals have been obtained; the remaining p
residuals will then be zero. John and Draper have shown
that a missing value procedure can be used as an alternative
to deleting observations. The resulting adjusted residuals
are homoscedastic and uncorrelated. Observations can be
selected for deletion in any order; the order used in outlier
detection is in terms of a decreasing modulus size of
residuals obtained when the observation for the largest
residual at the previous stage is deleted.

Note  the  reciprocity  between  stepwise  and  adjusted 
residuals;  they  are  based  on  forward  and  backward  selection 
procedures  respectively. 
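
The stepwise (recursive) residuals of (c) can be sketched as follows.
This is an illustrative implementation, not code from any of the cited
papers: the function name `recursive_residuals` and the straight-line
example data are assumptions. Each residual is computed here as the
prediction error from a fit to the preceding observations, scaled to
variance $\sigma^2$; after normalization this coincides with the
residual of the uth point from a fit to the first u observations.

```python
import numpy as np

def recursive_residuals(X, y):
    """Return the n - p recursive residuals for the given row order.

    Assumes the first p rows of X are linearly independent.
    """
    n, p = X.shape
    w = np.empty(n - p)
    for u in range(p, n):  # 0-based index of the point being predicted
        Xu, yu = X[:u], y[:u]
        XtX_inv = np.linalg.inv(Xu.T @ Xu)
        b = XtX_inv @ Xu.T @ yu  # LS fit to the earlier points only
        x = X[u]
        # Prediction error scaled so that its variance equals sigma^2.
        w[u - p] = (y[u] - x @ b) / np.sqrt(1.0 + x @ XtX_inv @ x)
    return w

rng = np.random.default_rng(1)
n = 10
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
y = 2.0 + 0.5 * X[:, 1] + rng.standard_normal(n)

w = recursive_residuals(X, y)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = np.sum((y - X @ beta_hat) ** 2)

# A different entry order gives a different set of residuals.
order = rng.permutation(n)
w_perm = recursive_residuals(X[order], y[order])

print(np.round(w, 3))
print(np.round(w_perm, 3))
print(np.isclose(np.sum(w ** 2), rss), np.isclose(np.sum(w_perm ** 2), rss))
```

Whatever the entry order, the squared recursive residuals sum to the
ordinary LS residual sum of squares, but the individual values change
with the order, which illustrates the dependence on ordering noted in (c).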

The standard tests of normality, discussed in the previous
section,  could  be  applied  to  any  of  the  uncorrelated  transformed 
residuals  given  above.  For  a  number  of  reasons  it  was  decided  that 
such  procedures  would  not,  in  general,  be  particularly  appropriate. 

The  arbitrariness  of  transformed  residuals  is  one  of  the 
main  drawbacks.  For  testing  normality,  there  appears  to  be  no  reason 
in  general  why  one  particular  A  matrix  in  (3.1)  should  be  preferred 
to  any  other  matrix.  If  the  order  of  observations  is  predetermined, 
as  in  time  series,  then  the  use  of  stepwise  or  adjusted  residuals  may 
be  appropriate.  However,  in  other  cases  the  use  of  such  residuals  is 
dubious  especially  as  the  resulting  residuals  may  not  be  normally 
distributed  even  when  the  error  distribution  is  normal.  This  is 
certainly  the  case,  for  example,  when  adjusted  residuals  are  used 
with  an  ordering  based  on  a  decreasing  modulus  size  of  residuals. 


Further, the fact that the residuals are uncorrelated
does not imply that they are independent in general. Huang and
Bolch (1974) point this out for BLUS residuals and add that
"... the lack of independence among BLUS residuals when the
disturbances [errors] are not normal may be at least as great
as the lack of independence among [ordinary] LS residuals". In
other  related  work,  Ramsey  (1969)  and  Ramsey  and  Gilbert  (1972) 
investigate  tests  for  detection  of  regression  specification  errors 
such  as  omitted  variables,  incorrect  functional  forms,  simultaneous 
equation  problems  and  heteroscedasticity.  Ramsey  and  Gilbert  note 
that  "...  there  does  not  seem  to  be  any  universal  simple  solution 
to  the  problem  of  choosing  between  [BLUS]  and  [ordinary]  LS 
residuals". 

Another disadvantage of the transformed residuals is
the computational burden involved in calculating them. Tests
for normality based on such residuals would have to enjoy
considerable benefits over tests using the LS residuals for this
disadvantage to be overcome.

Hence, in this paper, we have investigated the
possibility of applying a standard test, namely the Shapiro-Wilk test,
to the LS residuals. As is shown in the next section, such a
test appears to be appropriate if the test percentage points are
modified by making a simple adjustment. The virtues of this
procedure are that it is easy to carry out, needs no additional
tables, and its accuracy seems adequate for the situations we have
investigated.

4.  SIMULATION  RESULTS  FROM  SELECTED  EXPERIMENTAL  SITUATIONS


A simulation study was carried out on a number of
selected experimental situations to assess the appropriateness
of using the Shapiro-Wilk test to examine the normality of LS
residuals. Table 1 shows results obtained using two-way r x c
tables. N = rc standard normal variates were generated and
residuals $e_i$ (i = 1, ..., N) evaluated assuming the usual
additive model. The Shapiro-Wilk statistic

$$ W = \Big\{ \sum_{i=1}^{[N/2]} a_{N-i+1} \big( e_{(N-i+1)} - e_{(i)} \big) \Big\}^2 \Big/ \sum_{i=1}^{N} e_i^2 \tag{4.1} $$

was then calculated, where the a's are constants given by Shapiro
and Wilk (1965). For each r x c table, this procedure was
repeated 3,000 times. Note that, in general, standardized
residuals should be used in (4.1) to give uniform variances, but
this is not necessary here since all residuals are estimated with
equal precision. As a check on the procedure and calculations,
the Shapiro-Wilk statistic was also calculated for the errors
themselves. The 5% and 10% percentage points of the W statistic
for N observations, given in Shapiro and Wilk (1965), were then
used to obtain the proportion of sampled W values falling in the
corresponding tail area. For the errors these proportions
should be exactly 5% and 10%, any discrepancies reflecting sampling
error. It can be seen from Table 1 that these results are satisfactory.
For the residuals $e_i$, on the other hand, the proportions are clearly
too small. The tables of percentage points can, however, be
examined to determine what value of N should have been used
in order to produce the correct tail areas. In Table 1, these
values of N which lead to the correct 5% and 10% points are
denoted by $N_5$ and $N_{10}$ respectively. This means that for
the 3 x 5 two-way table, for example, if the Shapiro-Wilk test
had been applied at the 5% level with $N_5 = 18$, instead of N = 15,
an appropriate test would have been made. Similarly, $N_{10} = 17.5$
would have been appropriate for a 10% level. Note that $N_5 > N$ and
$N_{10} > N$ for all cases in Table 1.

Table 2 shows a parallel set of figures for the residuals
from main effect models for various types of factorial designs
as indicated by the first column. Again $N_5 > N$ and $N_{10} > N$ in most
but (perhaps surprisingly) not all cases. Table 3 examines the
case where two factor interactions are estimated as well, thus
further reducing the residual degrees of freedom. The results are
in line with the previous tables except for the cases 2^2 x 4,
2^2 x 5 and 2^2 x 6, which were found to be completely anomalous;
the 3,000 simulations produced many more small W values than
did neighbouring cases. Close examination of these anomalous cases
revealed no assignable cause for this puzzling and atypical behaviour.
Additional computations were made for the 2^2 x 7 and 2^2 x 8 cases
and these confirmed the anomaly, as shown in Table 3. These cases
are excluded from the discussion in the next section.

Table 4 provides parallel results and a broadly similar
pattern for a series of rotatable response surface designs. These
consist of a $2^k$ factorial (k = 3, 4) or a $2^{k-1}$ fractional

Table 1.  Generations for selected r x c two-way tables showing
          the results of applying the Shapiro-Wilk normality test
          to errors, and to residuals from an additive main effects
          model without interactions.

                        5% POINTS                         10% POINTS
DESIGN    N    % less than  % less than  N_5    % less than  % less than  N_10
               5%           5%                  10%          10%
               ERRORS       RESIDUALS           ERRORS       RESIDUALS

3 x 6    18
4 x 5    20
3 x 7    21
3 x 8    24
4 x 6    24
5 x 5    25
3 x 9    27
4 x 7    28
3 x 10   30
5 x 6    30
4 x 8    32
5 x 7    35
4 x 9    36
6 x 6    36
4 x 10   40
5 x 8    40
6 x 7    42
5 x 9    45
6 x 8    48
7 x 7    49


Table 2.  Generations for selected factorial designs showing the
          results of applying the Shapiro-Wilk normality test to
          the residuals from an additive main effects model without
          interactions.

                          5% POINTS                10% POINTS
DESIGN           N    Proportion less  N_5     Proportion less  N_10
                      than 5%                  than 10%

2^4             16
2^2 x 4         16
2 x 3^2         18
2^2 x 5         20
2^3 x 3         24
2 x 3 x 4       24
2^2 x 6         24
3^3             27
2 x 3 x 5       30
2^3 x 4         32
2 x 4^2         32
2^2 x 3^2       36
3^2 x 4         36
2 x 3 x 6       36
2^3 x 5         40
2 x 4 x 5       40
3^2 x 5         45
2^2 x 3 x 4     48
3 x 4^2         48
2 x 4 x 6       48
2^3 x 6         48


Table 3.  Generations for selected factorial designs showing the
          results of applying the Shapiro-Wilk normality test to
          the residuals from an additive main effects model with
          first order interactions.

                                5% POINTS                10% POINTS
DESIGN    N    Residual    Proportion less  N_5     Proportion less  N_10
               d.f. v      than 5%                  than 10%

                              .033                     .061          20
                              .289            6        .399           6
                              .024           22        .054          22
                              .294            9        .412           8
                              .025           29        .058          30
                              .026           29        .064          28
                              .289           12        .439          12
                              .029           34        .057          35
                              .322           13        .459          13
                              .038           33        .084          33
                              .027           38        .061          38
                              .051           32        .089          35
                              .331           15        .449          15
                              .022           44        .056          43
                              .027           44        .055          44
                              .045           37        .088          37
                              .037           43        .079          44
                              .054           39        .097          41
                              .032          >50        .069         >50
                              .036          >50        .079         >50
                              .028          >50        .069         >50
                              .062           46        .100          48
                              .044           50        .089          50


Table 4.  Generations for selected second order rotatable composite
          designs showing the results of applying the Shapiro-Wilk


factorial (k = 5, 6) with points coded as (±1, ±1, ..., ±1)
plus 2k axial points at a distance $2^{(k-p)/4}$ from the centre,
where p = 0 for k = 3, 4 and p = 1 for k = 5, 6, plus $n_0$
centre points (as tabulated). Again the results in Tables
2-4 are based on 3,000 simulations.

5.  APPROXIMATIONS  TO  N_5  AND  N_10  AND  RECOMMENDATIONS

The calculations in Section 4 indicate that the Shapiro-Wilk
test for normality is incorrect when applied in the usual
fashion to correlated residuals but that the effects can be adjusted
for by using the Shapiro-Wilk percentage points with values $N_5$ and
$N_{10}$ rather than with N. These alternative values are somewhat
higher than N so that, with no adjustment to N, the presence of
correlations among the residuals means that the assumption of normality
will be rejected less often than for independent samples of N observations.
That is, use of the Shapiro-Wilk test without adjustment will usually
result in a more conservative test.

An  important  but  difficult  question  is  whether  there  is  an 
attributable  pattern  to  results  of  this  type,  in  general.  For  an 
approximation  it  seems  sensible  to  seek  an  adjustment  to  N  which 
decreases  to  zero  as  the  residual  degrees  of  freedom  v  tends  to 
N  ,  so  that  no  adjustment  would  be  made  in  the  limiting  (but  impractical) 
case  where  the  model  contained  no  parameters  and  so  the  residuals  were 
independent.  For  that  reason  formulas  of  the  type 

$$ \hat{N}_\delta = N + \delta(1 - \nu/N) \tag{5.1} $$


were examined (amongst others). It turned out that the (rather
appealing) use of $\delta = 5$ for $N_5$ and $\delta = 10$ for $N_{10}$ provided
an adequate approximation considering the sampling error that
occurs naturally in the generations. Table 5 shows the probability
levels that would have been obtained, for the generations of Table 1,
had the $\hat{N}_5$ and $\hat{N}_{10}$ values of (5.1) been employed. The agreement
is excellent for $\hat{N}_{10}$, less so (but conservative) for $\hat{N}_5$.

Similar conclusions are obtained from the generations in Tables 2-4.

As  an  overall  practical  recommendation  for  this  work,  we 
thus  give  the  following  rule  of  thumb: 

To check the normality of a set of N standardised
residuals from a regression model, apply the Shapiro-Wilk test but
use the percentage point for $\hat{N}_\delta = N + \delta(1 - \nu/N)$, where $\nu$ is
the residual degrees of freedom and $\alpha = \delta/100$ is the desired
significance level.
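
As a worked illustration of the rule of thumb, the short sketch below
evaluates the adjusted sample size for the 3 x 5 two-way table, where
N = 15 and the additive model leaves (3 - 1)(5 - 1) = 8 residual degrees
of freedom. The function name `adjusted_sample_size` is hypothetical.
Because the result is generally not an integer, it is assumed here that
the percentage point would be read from the Shapiro-Wilk tables by
interpolation or rounding.

```python
def adjusted_sample_size(n, resid_df, delta):
    """Adjusted N for the Shapiro-Wilk tables: N + delta * (1 - v / N).

    n        -- number of residuals N
    resid_df -- residual degrees of freedom v (0 < v <= N)
    delta    -- 100 * significance level, e.g. 5 or 10
    """
    if not 0 < resid_df <= n:
        raise ValueError("residual d.f. must satisfy 0 < v <= N")
    return n + delta * (1.0 - resid_df / n)

# 3 x 5 two-way table with an additive model: N = 15, v = 8.
n5 = adjusted_sample_size(15, 8, 5)    # 15 + 5 * 7/15
n10 = adjusted_sample_size(15, 8, 10)  # 15 + 10 * 7/15
print(round(n5, 2), round(n10, 2))
```

When the residual degrees of freedom equal N the adjustment vanishes,
matching the limiting case mentioned in the derivation of (5.1).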

Table 5.  Actual tail proportions found (for the generations of
          Table 1) lying below the Shapiro-Wilk $\alpha$ percentage
          points when values $\hat{N}_\delta = N + \delta(1 - \nu/N)$ are used
          instead of N in the Shapiro-Wilk tables for the
          $\alpha = \delta/100$ cases, $\delta = 5$ and 10. (Ideally the values in
          the last two columns should be 0.050 and 0.100
          respectively).

             Tail areas achieved by using
Design       $\hat{N}_5$      $\hat{N}_{10}$

3 x 5        .043      .092
4 x 4        .045      .100
3 x 6        .034      .082
4 x 5        .049      .104
3 x 7        .030      .078
3 x 8        .035      .083
4 x 6        .042      .094
5 x 5        .041      .091
3 x 9        .027      .072
4 x 7        .040      .086
3 x 10       .032      .084
5 x 6        .046      .101
4 x 8        .045      .098
5 x 7        .045      .100
4 x 9        .048      .099
6 x 6        .045      .097
4 x 10       .035      .096
5 x 8        .052      .107
6 x 7        .051      .101
5 x 9        .046      .103
6 x 8        .043      .095
7 x 7        .041      .094